
1 Introduction 

Optimized NML modules/macros that implement a certain operation, function or task can be embedded
in the C source code of a given application. Figure 1 shows the compilation flow when NML modules
are used in a C program. Two types of integration schemes are currently supported by the XPP-VC
compiler [1]. One refers to the integration of NML modules that release their resources after
computing and interface to the C program via array variables (memories). The other approach refers to
NML modules that do not release their resources and interface to the C program via scalar variables.
The name of each NML module must start with "XPP_" and there must exist an NML file with the same
name where the module is defined. A C header file ("*.h"), where each module's interface is declared,
can be used (the other way is to specify the interface declaration in the C program). Internal memories
used by the NML module must be pre-placed and the placement information must be declared in the
interface declaration. Special pragmas are used to declare the positions of the memories on the XPP [3].
Table 1 shows the pragmas supported and Table 2 shows some pragmas that will be considered in future
improvements. The interface specification between the C code and the NML module must be preceded
by the pragma identifying the module or an instance of the module (#pragma MODULE "<name>").
Memories used only in the scope of the module must also be declared using #pragma IRAM <x>, <y>
without a name.
Table 1: Pragmas used to specify the interface between C code and NML modules.

Reference:        NML modules
Syntax:           #pragma MODULE <"name">
Comments:         Defines that name refers to an NML module or to an instance of an NML module.
Current support:  yes

Reference:        specific instances of NML modules
Syntax:           #pragma place <"name"> <X>,<Y>
Comments:         Manual placement of an instance of an NML module. When the position specified
                  refers to an IRAM, the compiler marks such IRAM as used by the current
                  configuration. The name is not needed when the pragma is used after the
                  module invocation.
Current support:  yes

Reference:        NML modules
Syntax:           #pragma inline <"name">
Comments:         Instructs the compiler to instantiate the NML module without creating a
                  special configuration for this module.
Current support:  yes (can be used in the first approach)

Reference:        array variables in internal memories
Syntax:           #pragma IRAM <X>,<Y> <array name>
Comments:         "X" and "Y" define the IRAM used to accommodate the array.
Current support:  yes

Reference:        internal memories used by the NML core internally
Syntax:           #pragma IRAM <X>,<Y>
Comments:         "X" and "Y" define the IRAM used.
Current support:  yes

Reference:        constants
Syntax:           #pragma CONST <value>
Comments:         The order of the declarations must be the order of the constants in the NML
                  module.
Current support:  yes
[Figure 1 diagram not reproduced; surviving labels: xmap, XPP Binary Code]

Figure 1: Compilation flow when integrating NML modules in C programs.



Table 2: Pragmas used to specify the interface between NML modules and C code (cont.).

Reference:        array variables in external memories
Syntax:           #pragma EXRAM <Z> <array name> <base address>, <size>
Comments:         "Z" identifies the I/O port used to interface to the external RAM where the
                  array is located.
Current support:  no (requires XPP-VC support to specify the base address of each array
                  mapped to an external memory)

Reference:        external memories used by the NML core internally
Syntax:           #pragma EXRAM <Z> <base address>, <size>
Comments:         "Z" identifies the I/O port used to interface to the external RAM.
Current support:  no (requires XPP-VC support to specify the base address of each array
                  mapped to an external memory)

Reference:        I/O ports
Syntax:           #pragma IN | OUT | INOUT <Z> <name>
Comments:         Declares an I/O port at position "Z" as input (IN), output (OUT), or
                  input/output (INOUT). Assigns a name to that I/O port.
Current support:  no (should be specified in the interface, but is not used by XPP-VC; used
                  as a form of documentation)
2 Modules not Connected to the NML Structures Generated by XPP-VC

To instantiate a module, the user must use the module's name like a normal C function call. The
declaration of the module is done as a normal C function plus the pre-defined C pragmas that specify
the interface. For each array variable in the function's argument list that is mapped to an IRAM, the
location of the IRAM that contains it must be declared using #pragma IRAM <x>, <y> <array name>.

In this approach, the compiler supports only array variables as arguments. However, constants (which
can be used to communicate parameterizable features: identification of an I/O port, for instance) can be
specified using the CONST pragma. For internal RAMs, we assume a one-to-one mapping (each internal
memory IRAM contains a single array variable). The place and CONST pragmas can be used after
the invocation of the module in the C program. In this case the programmer does not need to use the
MODULE pragma (see Figure 5).

Each module called in the C code is automatically embedded in the NML output file. By default, the
compiler generates one configuration for each call. If instructed by the inline pragma, NML modules
can be embedded in the same configuration in conjunction with structures generated from C code (note
that this option can only be used with independent modules, which must also be independent from the C
code segments in the same configuration; it is a scheme to include concurrent tasks in the same
configuration, see footnote 1). Each module used in the C code must self-release its resources after
completion of computation. Each module must have only one configuration. Integration of NML
applications with more than one configuration into C code must be done explicitly by integrating each
of the modules (configurations) individually.

Consider the DCT algorithm shown in Figure 2. It consists of two 8x8 matrix multiplications.
Assuming the existence of an optimized NML module that multiplies two square matrices,
the user can re-program the algorithm using the optimized module (see source code in Figure 3). The
XPP_next_conf calls shown as comments in Figure 3 illustrate the configuration boundaries that will be
automatically inserted by the compiler to furnish one configuration for each module invocation. The
interface declaration for the XPP_mat_mult module can be seen in Figure 4. Figure 4(a) shows the
definition of N (number of rows or columns of the matrix), which is used to parameterize the NML
module (see Figure 4(b)).
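The decomposition of the DCT into two square matrix multiplications can be checked on a host machine with a plain C reference model. The sketch below is only an illustration: ref_mat_mult, ref_transpose and ref_dct are hypothetical helper names, not part of XPP-VC or of any NML library, and the model assumes row-major 8x8 integer matrices as in Figure 2.

```c
#include <assert.h>

#define N 8  /* matrix dimension, as in the 8x8 DCT of the text */

/* Hypothetical host-side stand-in for an NML matrix-multiply module:
   computes c = a x b for row-major n x n matrices. */
static void ref_mat_mult(const int *a, const int *b, int *c, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int tmp = 0;
            for (int k = 0; k < n; k++)
                tmp += a[i * n + k] * b[k * n + j];
            c[i * n + j] = tmp;
        }
    }
}

/* Builds the transposed cosine block (CosTrans in the figures). */
static void ref_transpose(const int *src, int *dst, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Two-multiplication DCT: temp = in x cos^T, then out = cos x temp. */
static void ref_dct(const int *in, const int *cos_block, int *out)
{
    int cos_trans[N * N];
    int temp[N * N];

    ref_transpose(cos_block, cos_trans, N);
    ref_mat_mult(in, cos_trans, temp, N);   /* first multiplication  */
    ref_mat_mult(cos_block, temp, out, N);  /* second multiplication */
}
```

Calling ref_dct with a cosine block equal to twice the identity matrix scales every input element by four, which is a quick sanity check that both multiplications are applied.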

In this example, the XPP memory resources are statically pre-defined for each NML module. Thus, to
transfer different array variables to distinct instances, the user must explicitly copy array elements to
the array variables that are used to communicate data between the program and the NML module (see
Figure 3). Consequently, all instances in the code of a specific NML module with memories in fixed
positions must use the same list of array variables as their argument list.
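The consequence of fixed memory positions, namely that data must be copied into the arrays bound to the module before each invocation, can be modelled in ordinary C. The following is a minimal host-side sketch under stated assumptions: iram_a, iram_b, iram_c, fixed_mat_mult and call_with_copy are hypothetical names that stand in for the pre-placed IRAMs and for the module invocation; none of them belongs to XPP-VC.

```c
#include <assert.h>
#include <string.h>

#define DIM 8
#define LEN (DIM * DIM)

/* Hypothetical model of the three pre-placed IRAMs bound to the module. */
static int iram_a[LEN], iram_b[LEN], iram_c[LEN];

/* Stand-in for one module invocation: it always reads iram_a and iram_b
   and always writes iram_c, because the memory positions are fixed. */
static void fixed_mat_mult(void)
{
    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            int tmp = 0;
            for (int k = 0; k < DIM; k++)
                tmp += iram_a[i * DIM + k] * iram_b[k * DIM + j];
            iram_c[i * DIM + j] = tmp;
        }
    }
}

/* To multiply arbitrary arrays, the caller has to copy them into the
   fixed buffers first and copy the result out afterwards. */
static void call_with_copy(const int *a, const int *b, int *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    memcpy(iram_a, a, sizeof iram_a);
    memcpy(iram_b, b, sizeof iram_b);
    fixed_mat_mult();
    memcpy(c, iram_c, sizeof iram_c);
}
```

The two memcpy calls on the input side correspond to the copy loop of Figure 3; they are the overhead that the parameterized scheme avoids.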

Another scheme is the use of parameterized memory positions. In this case, for each invocation of the
NML module in the C code, the memory positions must be defined according to the initial positions of
the internal memories for each array variable (Figure 5 shows a DCT example using a parameterized
module to do the matrix multiplication). In this case no movement/relocation of data between internal
memories is necessary.
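A host-side model may help to see why parameterized positions remove the copy step: each invocation receives the coordinates of the memories it should use, so the data can stay where it is. The sketch below is an assumption-laden illustration. The grid size, the mem_pos type and the functions iram_at and param_mat_mult are hypothetical; only the idea of passing X/Y positions per invocation is taken from the text.

```c
#include <assert.h>

#define GRID_X 2     /* hypothetical number of IRAM columns */
#define GRID_Y 4     /* hypothetical number of IRAM rows    */
#define MAX_LEN 64   /* words per modelled IRAM             */

/* Hypothetical model of a grid of internal memories. */
static int iram[GRID_X][GRID_Y][MAX_LEN];

struct mem_pos { int x, y; };

/* Returns the modelled IRAM at a given position, with bounds checking. */
static int *iram_at(struct mem_pos p)
{
    assert(p.x >= 0 && p.x < GRID_X && p.y >= 0 && p.y < GRID_Y);
    return iram[p.x][p.y];
}

/* Stand-in for a parameterized module: the positions of the two operands
   and of the result are arguments of each invocation. */
static void param_mat_mult(struct mem_pos a, struct mem_pos b,
                           struct mem_pos c, int n)
{
    assert(n > 0 && n * n <= MAX_LEN);
    const int *pa = iram_at(a);
    const int *pb = iram_at(b);
    int *pc = iram_at(c);
    assert(pc != pa && pc != pb);  /* the result must not alias an input */

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int tmp = 0;
            for (int k = 0; k < n; k++)
                tmp += pa[i * n + k] * pb[k * n + j];
            pc[i * n + j] = tmp;
        }
    }
}
```

Two successive invocations can then chain through the same memory, for example writing a temporary result at position (1,2) and reading it back as an operand in the next call, without relocating any data.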

Initialization of array variables in the declarative section of the C program (e.g., int a[] = {1, 4, 1};)
that are used by an NML module is inserted in the first configuration of the application. However, examples

Footnote 1: Note that it must be ensured that the same PAE is not used by different independent designs in the same module.
// file dct.c

...

// do InIm x CosBlock(T)
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        tmp = 0;
        for (k = 0; k < N; k++)
            tmp += InIm[i*N+k] * CosBlock[k+j*N];
        TempBlock[i*N+j] = tmp;
    }
}

// do CosBlock x TempBlock
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        tmp = 0;
        for (k = 0; k < N; k++)
            tmp += TempBlock[k*N+j] * CosBlock[i*N+k];
        OutIm[i*N+j] = tmp;
    }
}

Figure 2: C code for a DCT implementation based on matrix multiplications.

where the NML module is the first configuration in the application require xmap support (this feature is
planned).

3 Modules Connected to the NML Structures Generated by XPP-VC 

Another possibility is to embed and interconnect NML modules with the NML generated by the XPP-VC
compiler. In this case there can exist interconnections between scalar variables of the C code and ports
of the NML modules. This is done by using C structs (each struct must have a field with the same name
as the related NML module, which must start with "XPP_") to define each NML module and two special
functions: XPP_getmacro and XPP_putmacro. They are used to connect variables of the C code to
the input/output ports of the NML module.
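The put/get style of connecting C variables to module ports can be imitated on a host machine to make the dataflow behaviour concrete. The sketch below is not the XPP-VC mechanism itself. It is a hypothetical software mock in which mock_div_module, mock_put_a, mock_put_b, mock_get_c and mock_array_div are invented names, and in which the divider fires only when both input tokens are present, which is an assumption about the dataflow semantics.

```c
#include <assert.h>

/* Hypothetical mock of a divider module instance with two input ports
   (a, b) and one output port (c). The valid flags model data tokens. */
typedef struct {
    int a, b, c;
    int a_valid, b_valid;
} mock_div_module;

/* Mock counterparts of "put a value on an input port". */
static void mock_put_a(mock_div_module *m, int v) { m->a = v; m->a_valid = 1; }
static void mock_put_b(mock_div_module *m, int v) { m->b = v; m->b_valid = 1; }

/* Mock counterpart of "get a value from the output port". The module
   fires only when both input tokens are present. Returns 1 and stores
   the quotient on success, 0 if an input token is missing. */
static int mock_get_c(mock_div_module *m, int *out)
{
    if (!m->a_valid || !m->b_valid)
        return 0;
    assert(m->b != 0);            /* division by zero is not modelled */
    m->c = m->a / m->b;
    m->a_valid = m->b_valid = 0;  /* both tokens are consumed */
    *out = m->c;
    return 1;
}

/* Element-wise array division through a single mock instance. */
static void mock_array_div(const int *A, const int *B, int *C, int n)
{
    mock_div_module div1 = {0};
    for (int i = 0; i < n; i++) {
        int ci = 0;
        mock_put_a(&div1, A[i]);
        mock_put_b(&div1, B[i]);
        int ok = mock_get_c(&div1, &ci);
        assert(ok);
        (void)ok;
        C[i] = ci;
    }
}
```

The loop in mock_array_div has the same shape as the stylized C loops in the figures: two puts on the input ports followed by one get on the output port per element.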

Figure 6 shows a segment of code using an NML module to do integer division ("XPP_div"). Figure 7
shows the header of the XPP_div module and a segment of the NML code generated by the compiler
using an instance of that module. Figure 8 shows how to share the same instance of an NML module and
Figure 9 shows how to use more than one instance of the same NML module.

Figure 10 presents another example: a C program using XPP FIFOs. 

Note that the interface declarations attributed to an NML module instance override any previous
declarations attributed to the name of the NML module.

When it is not possible to synchronize NML module instances with their I/O port connections in a
dataflow scheme, a completion signal can be used (since it is not possible to declare event
signals in C, an integer type must be used):
#include "XPPlib.h"

const int CosBlock[M] = {...};

const int CosTrans[M] = {...}; // the transpose of CosBlock
// XPP_next_conf();

XPP_mat_mult(InIm, CosTrans, TempBlock); // matrix multiplication in NML
// XPP_next_conf();

// copy the values to the arrays used as arguments of XPP_mat_mult
for (i = 0; i < M; i++) {
    InIm[i] = CosBlock[i];
    CosTrans[i] = TempBlock[i];
}

// XPP_next_conf();

XPP_mat_mult(InIm, CosTrans, TempBlock); // matrix multiplication in NML
// XPP_next_conf();

Figure 3: A DCT implementation using NML modules to perform the matrix multiplications. Each NML
module will use a different configuration.

// file XPPlib.h: (a)
...

// The multiplication of two quadratic matrices in NML, function:
void XPP_mat_mult(int A[], int B[], int C[]);

// The specification of the arguments of the NML module
#pragma MODULE "XPP_mat_mult" // identify the module names using "<name>"

#pragma IRAM 1,0 A // IRAM <X>,<Y> <array name>
#pragma IRAM 1,1 B
#pragma IRAM 1,2 C

#pragma CONST 8 // number of elements in each row and column of the square matrix
// other modules:
...

// file dct.nml: (b)

...

INCLUDE "XPP_mat_mult.nml"

MODULE MOD2 {
    OBJ o1: XPP_mat_mult [8] { }
}

Figure 4: (a) Interface definition for the NML module XPP_mat_mult; (b) segment of the NML file
generated by the XPP-VC related to the second configuration.
#include "XPPlib.h"

const int CosBlock[M] = {...};

const int CosTrans[M] = {...}; // the transpose of CosBlock

...

XPP_mat_mult(InIm, CosTrans, TempBlock); // matrix multiplication in NML
#pragma place 0,0

#pragma CONST 1 // X position of memory with InIm
#pragma CONST 0 // Y position of memory with InIm
#pragma CONST 1 // X position of memory with CosTrans
#pragma CONST 1 // Y position of memory with CosTrans
#pragma CONST 1 // X position of memory with TempBlock
#pragma CONST 2 // Y position of memory with TempBlock
#pragma CONST 8 // number of row and column elements
XPP_mat_mult(CosBlock, TempBlock, OutIm); // matrix multiplication in NML
#pragma place 0,0

#pragma CONST 1 // X position of memory with CosBlock
#pragma CONST 3 // Y position of memory with CosBlock
#pragma CONST 1 // X position of memory with TempBlock
#pragma CONST 2 // Y position of memory with TempBlock
#pragma CONST 1 // X position of memory with OutIm
#pragma CONST 0 // Y position of memory with OutIm
#pragma CONST 8 // number of row and column elements

Figure 5: DCT example using NML modules to perform the matrix multiplications. Each NML module
will be mapped to a different configuration. In this case an NML module with parameterized memory
positions is used.



int end_mod;
...

XPP_putmacro(mod1.a, a);
do {
    XPP_getmacro(mod1.end, &end_mod);
} while (end_mod == 0);

In the example above, a connection is made between the C variable a and the port a of the instance mod1
of an NML module. After that, the program waits for the completion of the execution of the instance,
which is signaled by a nonzero value on the output port end of the NML module instance.
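The polling pattern on a completion port can be exercised on a host machine with a small mock. In the sketch below, mock_module, mock_start, mock_read_end and wait_for_completion are hypothetical names, and the latency counter is an invented device that makes the end port turn nonzero after a fixed number of reads; real module timing is not modelled.

```c
#include <assert.h>

/* Hypothetical mock of a module instance with an input port a and a
   completion port end, which is carried as a plain integer. */
typedef struct {
    int a;          /* value placed on the input port               */
    int remaining;  /* modelled reads left until the module is done */
    int end;        /* completion flag: nonzero means finished      */
} mock_module;

/* Places a value on port a and starts the modelled computation. */
static void mock_start(mock_module *m, int a, int latency)
{
    assert(latency >= 0);
    m->a = a;
    m->remaining = latency;
    m->end = 0;
}

/* Reads the end port. Each read advances the model by one step. */
static int mock_read_end(mock_module *m)
{
    if (m->remaining > 0)
        m->remaining--;
    if (m->remaining == 0)
        m->end = 1;
    return m->end;
}

/* Polls the end port until it is nonzero; returns the number of polls. */
static int wait_for_completion(mock_module *m)
{
    int polls = 0;
    int end_mod;
    do {
        end_mod = mock_read_end(m);
        polls++;
    } while (end_mod == 0);
    return polls;
}
```

The do/while loop always reads the port at least once, so a module that has already finished is detected on the first poll.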
// original C code (file arraydiv.c): (a)

for (i = 0; i < M; i++) {
    C[i] = A[i]/B[i];
}

...

// C code using an NML module to perform the division: (b)
#include "XPPlib.h"

...

divMOD div1;

...

for (i = 0; i < M; i++) {
    ai = A[i];
    bi = B[i];

    XPP_putmacro(div1.a, ai);
    XPP_putmacro(div1.b, bi);
    XPP_getmacro(div1.c, &ci);
    C[i] = ci;
}

// file XPPlib.h: (c)

// The declaration of the NML module to do integer division
// XPP_div.nml is the name of the NML file where the module
// XPP_div is defined
// computes c = a/b
typedef struct {
    int XPP_div; // identifies the name of the NML module
    int a, b, c;
} divMOD;

#pragma MODULE "XPP_div"

#pragma IRAM 1,0 // it uses an internal RAM in position X=1,Y=0
...

Figure 6: Example of embedding an NML module, which performs integer division, into C code: (a)
example in C; (b) the same example in stylized C; (c) the description of the NML module in the library.
// file XPP_div.nml: (d)

// The integer division in NML:
MODULE XPP_div(DIN a, DIN b, DOUT c) {
    // NML code to perform the division
    ...
}

// file arraydiv.nml: (e)

INCLUDE "XPP_div.nml"
...

MODULE ex {
    // NML code generated by the XPP-VC to interface to the NML module
    OBJ div1: NML_div {}

    div1.a = <object generated by XPP-VC>.X
    div1.b = <object generated by XPP-VC>.X
    <object generated by XPP-VC>.<A | B> = div1.c
}

Figure 7: Example of embedding an NML module, which performs integer division, into C code (cont.):
(d) the NML module; (e) the NML module integrated in the NML code of the design.

[2] PACT XPP Technologies, Inc., Germany, "NML Reference: Release 2.0," Technical Report, April 2001.

[3] PACT XPP Technologies, Inc., Germany, "The XPP White Paper: A Technical Perspective," Technical
Report, March 2002.
// original C code (file arraydiv1.c): (a)

for (i = 0; i < M; i += 2) {
    C[i] = A[i]/B[i];
    C[i+1] = A[i+1]/B[i+1];
}

// C code using a shared NML module to perform both divisions: (b)
#include "XPPlib.h"

divMOD div1;

for (i = 0; i < M; i += 2) {
    ai = A[i];
    bi = B[i];

    XPP_putmacro(div1.a, ai);
    XPP_putmacro(div1.b, bi);
    XPP_getmacro(div1.c, &ci);
    C[i] = ci;
    ai = A[i+1];
    bi = B[i+1];

    XPP_putmacro(div1.a, ai);
    XPP_putmacro(div1.b, bi);
    XPP_getmacro(div1.c, &ci);
    C[i+1] = ci;
}

Figure 8: Example embedding more than one NML module instance, which performs integer division,
into C code: (a) example in C, which uses two dividers; (b) the example using one NML instance of the
divider to perform the two divisions.




// C code using two instances of the NML divider: (c)
#include "XPPlib.h"

divMOD div1;

#pragma place "div1" 0,0
#pragma MODULE "div1"

#pragma IRAM 1,1 // the IRAM used when the module is placed in 0,0
divMOD div2;

#pragma place "div2" 0,6
#pragma MODULE "div2"

#pragma IRAM 1,7 // the IRAM used when the module is placed in 0,6

for (i = 0; i < M; i += 2) {
    ai = A[i];
    bi = B[i];

    XPP_putmacro(div1.a, ai);
    XPP_putmacro(div1.b, bi);
    XPP_getmacro(div1.c, &ci);
    C[i] = ci;
    ai = A[i+1];
    bi = B[i+1];

    XPP_putmacro(div2.a, ai);
    XPP_putmacro(div2.b, bi);
    XPP_getmacro(div2.c, &ci);
    C[i+1] = ci;
}

Figure 9: Example embedding more than one NML module instance, which performs integer division,
into C code (cont.): (c) the example using two NML instances of the divider.
// C code using XPP FIFOs (file ex.c): (a)
#include "XPPlib.h"
...

FIFO fifo1;

#pragma place "fifo1" 1,0 // place the fifo in position (1,0)
FIFO fifo2;

#pragma place "fifo2" 1,1 // place the fifo in position (1,1)

...

main() {
    int input_sample_real, delay_value_1, delay_value_2;
    int found_flag = 0;
    int zero_counter = 0;

    while (1) {
        XPP_getstream(1, 0, &input_sample_real);
        XPP_putmacro(fifo1.in, input_sample_real);
        XPP_getmacro(fifo1.out, &delay_value_1);
        XPP_putmacro(fifo2.in, delay_value_1);
        XPP_getmacro(fifo2.out, &delay_value_2);

        if ((input_sample_real + delay_value_1 *
             delay_value_1 - delay_value_2) == 0) {
            zero_counter++;
            if (zero_counter == 64) {
                found_flag = 1;
                zero_counter = 0;
            }
        }

        XPP_putstream(4, 0, found_flag);
    }
}

// file XPPlib.h: (b)
...

// The declaration of the XPP FIFO:
typedef struct {
    int XPP_FIFO; // identifies the name of the NML module
    int in, out;
} FIFO;

Figure 10: Example of a C program using an XPP FIFO: (a) example in C; (b) the description of the
NML module in the library.
// file XPP_FIFO.nml: (c)

MODULE XPP_FIFO(DIN in, DOUT out) {
    OBJ fifoX: FIFO @ 0,0 { // relative position used
        IN = in
    }

    out = fifoX.OUT
}

// file ex.nml: (d)
INCLUDE "XPP_FIFO.nml"

OBJ fifo1: XPP_FIFO @ $1,$0 { }

fifo1.in = <object generated by XPP-VC>.X
<object generated by XPP-VC>.<A | B> = fifo1.out
OBJ fifo2: XPP_FIFO @ $1,$1 { }

fifo2.in = <object generated by XPP-VC>.X
<object generated by XPP-VC>.<A | B> = fifo2.out

Figure 11: Example of a C program using an XPP FIFO (cont.): (c) the FIFO in NML (relative position
is used); (d) the section of the final ex.nml file with the FIFO's instantiations and connections.